Excalibur-7b-DPO is a large language model built on the Excalibur-7b foundation model and fine-tuned with Direct Preference Optimization (DPO), with the aim of improving dialogue quality and performance in vision-oriented application scenarios.
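As a rough usage sketch, the model can be loaded with the Hugging Face Transformers library like any other causal language model. Note that the repository id below and the availability of a chat template are assumptions, not details confirmed by this card.

```python
# Minimal sketch, assuming the model is hosted under the repo id below
# and ships a tokenizer with a chat template.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "InferenceIllusionist/Excalibur-7b-DPO"  # assumed repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # half precision so a 7B model fits on one GPU
    device_map="auto",
)

# Build a chat-style prompt using the tokenizer's chat template.
messages = [{"role": "user", "content": "Briefly explain what DPO fine-tuning does."}]
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

# Generate a response from the DPO-tuned model.
outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```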